A Comparative Study of Minimally Supervised Morphological Segmentation
Abstract
This article presents a comparative study of a subfield of morphology learning referred to as minimally supervised morphological segmentation. In morphological segmentation, word forms are segmented into morphs, the surface forms of morphemes. In the minimally supervised data-driven learning setting, segmentation models are learned from a small number of manually annotated word forms and a large set of unannotated word forms. In addition to providing a literature survey of published methods, we present an in-depth empirical comparison of three diverse model families, including a detailed error analysis. Based on the literature survey, we conclude that the existing methodology contains substantial work on generative morph lexicon-based approaches and on methods based on discriminative boundary detection. As for which approach has been more successful, both the previous work and the empirical evaluation presented here strongly suggest that the discriminative boundary detection methodology currently yields state-of-the-art results.
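To make the boundary detection view concrete, the following minimal sketch shows how a segmented word form can be recast as one binary label per character, which is the kind of target a discriminative sequence labeler could be trained to predict. The binary label scheme, the function names, and the example word are illustrative assumptions and are not taken from the article.

```python
from typing import List


def segmentation_to_labels(morphs: List[str]) -> List[int]:
    """Encode a segmentation as one label per character.

    Assumed scheme (illustrative only): 1 marks the first character
    of a morph, 0 marks every other character.
    """
    labels: List[int] = []
    for morph in morphs:
        if not morph:
            raise ValueError("morphs must be non-empty strings")
        labels.append(1)
        labels.extend([0] * (len(morph) - 1))
    return labels


def labels_to_segmentation(word: str, labels: List[int]) -> List[str]:
    """Decode per-character boundary labels back into a list of morphs."""
    if len(word) != len(labels):
        raise ValueError("word and labels must have the same length")
    morphs: List[str] = []
    for char, label in zip(word, labels):
        # Start a new morph at a boundary label, or at the first character.
        if label == 1 or not morphs:
            morphs.append(char)
        else:
            morphs[-1] += char
    return morphs


# Hypothetical annotated word form: un + believ + able
example_morphs = ["un", "believ", "able"]
example_labels = segmentation_to_labels(example_morphs)
restored = labels_to_segmentation("".join(example_morphs), example_labels)
```

Under this encoding, the small annotated set supplies the labels directly, while unannotated word forms can only contribute indirectly, for example through features derived from them.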
Similar resources
A Comparative Study on Minimally-Supervised Morphological Segmentation
This article presents a comparative study on a sub-field of morphology learning referred to as minimally-supervised morphological segmentation. In morphological segmentation, word forms are segmented into morphs, the surface forms of morphemes. In the minimally-supervised data-driven learning setting, segmentation models are learned from a small amount of manually annotated word forms and a larg...
Minimally-Supervised Morphological Segmentation using Adaptor Grammars
This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labe...
Comparing minimally supervised home-based and closely supervised gym-based exercise programs in weight reduction and insulin resistance after bariatric surgery: A randomized clinical trial
Background: Effectiveness of various exercise protocols in weight reduction after bariatric surgery has not been sufficiently explored in the literature. Thus, in the present study, we aimed at comparing the effect of minimally supervised home-based and closely supervised gym-based exercise programs on weight reduction and insulin resistance after bariatric surgery. ...
Painless Semi-Supervised Morphological Segmentation using Conditional Random Fields
We discuss data-driven morphological segmentation, in which word forms are segmented into morphs, that is, the surface forms of morphemes. We extend a recent segmentation approach based on conditional random fields from purely supervised to semi-supervised learning by exploiting available unsupervised segmentation techniques. We integrate the unsupervised techniques into the conditional random f...
Semi-supervised Learning for Mongolian Morphological Segmentation
Unlike previous Mongolian morphological segmentation methods based on large labeled training data or complicated rules devised by linguists, we explore a novel semi-supervised method for a practical application, i.e., statistical machine translation (SMT), based on a low-resource learning setting, in which a small amount of labeled data and a large amount of unlabeled data are available. First,...
Journal title:
- Computational Linguistics
Volume: 42
Publication date: 2016